A Rough Sets Partitioning Model for Mining Sequential Patterns with Time Constraint

Authors

  • Jigyasa Bisaria
  • Namita Shrivastava
  • K. R. Pardasani
Abstract

Nowadays, data mining and knowledge discovery methods are applied across a variety of enterprise and engineering disciplines to uncover interesting patterns in databases. The study of sequential patterns is an important data mining problem because of its wide applicability to real-world time-dependent databases. Sequential patterns are inter-event patterns, ordered over a time period, that are associated with the specific objects under study. Frequent sequential patterns discovered over a predetermined time period are interesting data mining results and can aid decision support in many enterprise applications. Sequential pattern mining poses computational challenges because a long frequent sequence contains an enormous number of frequent subsequences. Useful results also depend on the right choice of event window. In this paper, we study the problem of sequential pattern mining from two perspectives: the computational aspect of the problem, and the incorporation and adjustability of the time constraint. We use the indiscernibility relation from the theory of rough sets to partition the search space of sequential patterns, and we propose a novel algorithm that allows previsualization of patterns and adjustment of the time constraint before the mining task is executed. The algorithm, Rough Set Partitioning, is at least ten times faster than GSP, the naive time-constraint-based sequential pattern mining algorithm. In addition, the method determines the time interval of the sequential patterns.

Keywords: Data mining, sequential patterns, indiscernibility relation, partitioning, etc.
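The abstract does not say which attributes define the indiscernibility relation or how the time constraint is checked, so the following is only a minimal sketch of the general idea, not the authors' Rough Set Partitioning algorithm. It assumes a toy database of timestamped event sequences, a hypothetical `first_event` attribute used to form the equivalence classes, and a hypothetical `max_span` parameter standing in for the time constraint. The function names `indiscernibility_partition` and `occurs_within` are illustrative and do not come from the paper.

```python
from collections import defaultdict


def indiscernibility_partition(table, attributes):
    """Group object ids into equivalence classes.

    Two objects are indiscernible when they agree on every attribute
    in `attributes` (the standard rough-set definition).
    """
    classes = defaultdict(list)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].append(obj_id)
    return dict(classes)


def occurs_within(sequence, pattern, max_span):
    """Check whether a non-empty `pattern` occurs as a subsequence of
    `sequence` with (last time - first time) <= max_span.

    `sequence` is a list of (time, event) pairs sorted by time.
    """
    for i, (start, event) in enumerate(sequence):
        if event != pattern[0]:
            continue
        # Greedy earliest match of the remaining events gives the
        # smallest possible end time for this starting position.
        pos, end = 1, start
        for time, ev in sequence[i + 1:]:
            if pos == len(pattern):
                break
            if ev == pattern[pos]:
                pos, end = pos + 1, time
        if pos == len(pattern) and end - start <= max_span:
            return True
    return False


# Hypothetical toy data: sequence id -> list of (time, event).
sequences = {
    "s1": [(1, "a"), (3, "b"), (4, "c")],
    "s2": [(2, "a"), (9, "b")],
    "s3": [(1, "b"), (2, "c")],
}

# Assumed attribute: the first event of each sequence.
table = {sid: {"first_event": seq[0][1]} for sid, seq in sequences.items()}
partition = indiscernibility_partition(table, ["first_event"])

# Count support for the pattern a -> b only inside the block that can contain it.
block = partition[("a",)]
support = sum(occurs_within(sequences[sid], ["a", "b"], max_span=5) for sid in block)
```

Partitioning first means a candidate pattern is only tested against the block whose sequences can match it, and because `max_span` is an argument to the check, it can be varied without rebuilding the partition.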

Similar resources

A sequential pattern mining algorithm using rough set theory

Sequential pattern mining is a crucial but challenging task in many applications, e.g., analyzing the behaviors of data in transactions and discovering frequent patterns in time series data. This task becomes difficult when valuable patterns are locally or implicitly involved in noisy data. In this paper, we propose a method for mining such local patterns from sequences. Using rough set theory,...

Discovering Active and Profitable Patterns with RFM (Recency, Frequency and Monetary) Sequential Pattern Mining–A Constraint Based Approach

Sequential pattern mining is an extension of association rule mining that discovers time-related behaviors in sequence databases. It extends association by adding time to the transactions. The problem of finding association rules is concerned with intra-transaction patterns, whereas sequential pattern mining is concerned with inter-transaction patterns. Generalized Sequential Pattern (GSP) mining a...

A Rough Set Approach in Choosing Partitioning Attributes

Data mining partitioning is performed on attributes to increase data concentration. To choose effective partitioning attributes, a rough set based technique that measures the crispiness of the partitioning is presented. This technique is based on the mathematical theory of rough sets. This method is a simple and efficient approach to measure the crispiness of the partition in data mining.

A Constraint Programming Approach for Mining Sequential Patterns in a Sequence Database

Constraint-based pattern discovery is at the core of numerous data mining tasks. Patterns are extracted with respect to a given set of constraints (frequency, closedness, size, etc). In the context of sequential pattern mining, a large number of devoted techniques have been developed for solving particular classes of constraints. The aim of this paper is to investigate the use of Constraint Pro...

Discovering Sequential Association Rules with Constraints and Time Lags in Multiple Sequences

We present MOWCATL, an efficient method for mining frequent sequential association rules from multiple sequential data sets with a time lag between the occurrence of an antecedent sequence and the corresponding consequent sequence. This approach finds patterns in one or more sequences that precede the occurrence of patterns in other sequences, with respect to user-specified constraints. In addi...

Journal title:
  • CoRR

Volume abs/0906.4327  Issue 

Pages  -

Publication date 2009